Noise Corruption of Empirical Mode Decomposition and Its Effect on Instantaneous Frequency
Huang's Empirical Mode Decomposition (EMD) is an algorithm for analyzing
nonstationary data that provides a localized time-frequency representation by
decomposing the data into adaptively defined modes. EMD can be used to estimate
a signal's instantaneous frequency (IF) but suffers from poor performance in
the presence of noise. To produce a meaningful IF, each mode of the
decomposition must be nearly monochromatic, a condition that is not guaranteed
by the algorithm and fails to be met when the signal is corrupted by noise. In
this work, the extraction of modes containing both signal and noise is
identified as the cause of poor IF estimation. The specific mechanism by which
such "transition" modes are extracted is detailed and builds on the observation
of Flandrin and Goncalves that EMD acts in a filter bank manner when analyzing
pure noise. The mechanism is shown to be dependent on spectral leak between
modes and the phase of the underlying signal. These ideas are developed through
the use of simple signals and are tested on a synthetic seismic waveform.
Comment: 28 pages, 19 figures. High quality color figures available on Daniel
Kaslovsky's website: http://amath.colorado.edu/student/kaslovsk
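For readers unfamiliar with how EMD extracts a mode, the following is a minimal sketch of a single sifting pass: the mean of the upper and lower envelopes is subtracted from the signal. It is an illustration under stated assumptions, not the authors' implementation. Standard EMD interpolates the envelopes with cubic splines and repeats the pass until a stopping criterion is met; here the envelopes are piecewise linear and only one pass is run. The function names and the test signal are hypothetical.

```python
import math


def local_extrema(x):
    """Return index lists of the interior local maxima and minima of x."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] <= x[i + 1]]
    return maxima, minima


def linear_envelope(x, knots):
    """Piecewise-linear envelope through x at the given knot indices.

    Simplification (assumption): standard EMD uses cubic splines. Values
    before the first knot and after the last knot are held constant.
    """
    env = [0.0] * len(x)
    for i in range(len(x)):
        if i <= knots[0]:
            env[i] = x[knots[0]]
        elif i >= knots[-1]:
            env[i] = x[knots[-1]]
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b + 1):
            w = (i - a) / (b - a)
            env[i] = (1 - w) * x[a] + w * x[b]
    return env


def sift_once(x):
    """One sifting pass: subtract the mean of the two envelopes from x."""
    maxima, minima = local_extrema(x)
    if len(maxima) < 2 or len(minima) < 2:
        return None  # too few extrema to build envelopes
    upper = linear_envelope(x, maxima)
    lower = linear_envelope(x, minima)
    return [xi - 0.5 * (u + l) for xi, u, l in zip(x, upper, lower)]


# Hypothetical test signal: a 20-cycle tone riding on a slow linear trend.
n = 1000
t = [i / n for i in range(n)]
signal = [math.sin(2 * math.pi * 20 * ti) + 2 * ti for ti in t]
candidate = sift_once(signal)
```

On this noise-free signal the pass removes most of the slow trend, so the candidate mode is close to the tone alone. The abstract's point is that this clean separation is not guaranteed once noise is added.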
Non-Asymptotic Analysis of Tangent Space Perturbation
Constructing an efficient parameterization of a large, noisy data set of
points lying close to a smooth manifold in high dimension remains a fundamental
problem. One approach consists in recovering a local parameterization using the
local tangent plane. Principal component analysis (PCA) is often the tool of
choice, as it returns an optimal basis in the case of noise-free samples from a
linear subspace. To process noisy data samples from a nonlinear manifold, PCA
must be applied locally, at a scale small enough such that the manifold is
approximately linear, but at a scale large enough such that structure may be
discerned from noise. Using eigenspace perturbation theory and non-asymptotic
random matrix theory, we study the stability of the subspace estimated by PCA
as a function of scale, and bound (with high probability) the angle it forms
with the true tangent space. By adaptively selecting the scale that minimizes
this bound, our analysis reveals an appropriate scale for local tangent plane
recovery. We also introduce a geometric uncertainty principle quantifying the
limits of noise-curvature perturbation for stable recovery. With the purpose of
providing perturbation bounds that can be used in practice, we propose plug-in
estimates that make it possible to directly apply the theoretical results to
real data sets.
Comment: 53 pages. Revised manuscript with new content addressing application
of results to real data set
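The scale trade-off described above can be illustrated with a small sketch, assuming a toy setting that is not taken from the paper: noisy samples of the unit circle in the plane, with local PCA applied at several radii around the point (1, 0). The sketch measures the angle to the known true tangent direction, which is available only because the example is synthetic. The paper instead selects the scale by minimizing a bound on that angle. All names and parameter values below are hypothetical.

```python
import math
import random


def local_pca_angle(points, center, radius, true_tangent):
    """Angle in radians between the top local-PCA direction and true_tangent.

    Uses only the 2-D points that lie within `radius` of `center`.
    """
    nbrs = [p for p in points if math.dist(p, center) <= radius]
    if len(nbrs) < 2:
        raise ValueError("need at least two neighbors at this scale")
    n = len(nbrs)
    mx = sum(p[0] for p in nbrs) / n
    my = sum(p[1] for p in nbrs) / n
    cxx = sum((p[0] - mx) ** 2 for p in nbrs) / n
    cyy = sum((p[1] - my) ** 2 for p in nbrs) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in nbrs) / n
    # Closed-form direction of the leading eigenvector of a symmetric
    # 2x2 covariance matrix [[cxx, cxy], [cxy, cyy]].
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    v = (math.cos(theta), math.sin(theta))
    norm = math.hypot(true_tangent[0], true_tangent[1])
    dot = abs(v[0] * true_tangent[0] + v[1] * true_tangent[1]) / norm
    return math.acos(min(1.0, dot))


def noisy_circle(n, sigma, seed=0):
    """Sample n points from the unit circle with isotropic Gaussian noise."""
    rng = random.Random(seed)
    pts = []
    for _ in range(n):
        a = rng.uniform(-math.pi, math.pi)
        pts.append((math.cos(a) + rng.gauss(0.0, sigma),
                    math.sin(a) + rng.gauss(0.0, sigma)))
    return pts


points = noisy_circle(2000, 0.01)
center, tangent = (1.0, 0.0), (0.0, 1.0)
angles = {}
for r in (0.05, 0.1, 0.3, 0.6, 1.2):
    angles[r] = local_pca_angle(points, center, r, tangent)
```

Printing `angles` shows how the recovered direction varies with the radius. At very small radii few points are available and noise dominates, while at large radii the curvature of the circle enters the neighborhood.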
Geometric Sparsity in High Dimension
While typically complex and high-dimensional, modern data sets often have a concise underlying structure. This thesis explores the sparsity inherent in the geometric structure of many high-dimensional data sets.
Constructing an efficient parametrization of a large data set of points lying close to a smooth manifold in high dimension remains a fundamental problem. One approach, guided by geometry, consists in recovering a local parametrization (a chart) using the local tangent plane. In practice, the data are noisy and the estimation of a low-dimensional tangent plane in high dimension becomes ill posed. Principal component analysis (PCA) is often the tool of choice, as it returns an optimal basis in the case of noise-free samples from a linear subspace. To process noisy data, PCA must be applied locally, at a scale small enough such that the manifold is approximately linear, but at a scale large enough such that structure may be discerned from noise.
We present an approach that uses the geometry of the data to guide our definition of locality, discovering the optimal balance of this noise-curvature trade-off. Using eigenspace perturbation theory, we study the stability of the subspace estimated by PCA as a function of scale, and bound (with high probability) the angle it forms with the true tangent space. By adaptively selecting the scale that minimizes this bound, our analysis reveals the optimal scale for local tangent plane recovery. Additionally, we are able to accurately and efficiently estimate the curvature of the local neighborhood, and we introduce a geometric uncertainty principle quantifying the limits of noise-curvature perturbation for tangent plane recovery. An algorithm for partitioning a noisy data set is then studied, yielding an appropriate scale for practical tangent plane estimation.
Next, we study the interaction of sparsity, scale, and noise from a signal decomposition perspective. Empirical Mode Decomposition is a time-frequency analysis tool for nonstationary data that adaptively defines modes based on the intrinsic frequency scales of a signal. A novel understanding of the scales at which noise corrupts the otherwise sparse frequency decomposition is presented. The thesis concludes with a discussion of future work, including applications to image processing and the continued development of sparse representation from a geometric perspective.